Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion
Abstract
In this paper, we evaluate our proposed singing voice conversion method from various perspectives. To enable singers to freely control the timbre of their singing voice, we have proposed a singing voice conversion method based on many-to-many eigenvoice conversion (EVC), which uses a probabilistic model to convert the voice timbre of an arbitrary source singer into that of another arbitrary target singer. Furthermore, to make it easy to develop training data consisting of multiple parallel data sets between a single reference singer and many other singers, a technique has been proposed that uses a singing-to-singing synthesis system to efficiently and effectively generate the parallel data sets from nonparallel singing voice data sets of many singers. However, we have not yet sufficiently investigated the effectiveness of these proposed methods. In this paper, we conduct both objective and subjective evaluations to carefully investigate the effectiveness of the proposed methods. Moreover, we analyze the differences between singing voice conversion and speaking voice conversion. Experimental results show that our proposed method succeeds in enabling people to control their own voice timbre by using only an extremely small amount of the target singing voice.
Similar resources
Maximum a posteriori adaptation for many-to-one eigenvoice conversion
Many-to-one eigenvoice conversion (EVC) allows the conversion from an arbitrary speaker’s voice into the pre-determined target speaker’s voice. In this method, a canonical eigenvoice Gaussian mixture model is effectively adapted to any source speaker using only a few utterances as the adaptation data. In this paper, we propose a many-to-one EVC based on maximum a posteriori (MAP) adaptation for...
Doctoral Thesis Techniques for Improving Voice Conversion Based on Eigenvoices
Voice conversion (VC) is a technique for converting a source speaker’s voice into another speaker’s voice without changing linguistic information. As a typical approach to VC, a statistical method based on Gaussian mixture model (GMM) is used widely. A GMM is trained as a conversion model using a parallel data set composed of many utterance-pairs of source and target speakers. Although this fra...
Eigenvoice-based Approach to Voice Conversion and Voice Quality Control
This paper reviews our proposed approach to voice conversion (VC) and voice quality control based on an eigenvoice technique. VC is a technique to modify nonlinguistic information such as speaker individuality while keeping linguistic information unchanged. In the traditional VC framework, a conversion model for a source and target speaker-pair needs to be trained in advance using a parallel da...
Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion
This paper introduces speaker adaptive training techniques to tensor-based arbitrary speaker conversion. In voice conversion studies, realization of conversion from/to an arbitrary speaker’s voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC), which is based on an eigenvoice Gaussian mixture model (EV-GMM), was proposed. Although the EVC can effectively const...
Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models
This paper presents a novel method of enhancing esophageal speech using statistical voice conversion. Esophageal speech is one of the alternative speaking methods for laryngectomees. Although it doesn’t require any external devices, generated voices usually sound unnatural compared with normal speech. To improve the intelligibility and naturalness of esophageal speech, we propose a voice conver...